Insights into the genome of the ‘Loco’ Concholepas concholepas (Gastropoda: Muricidae) from low-coverage short-read sequencing: genome size, ploidy, transposable elements, nuclear RNA gene operon, mitochondrial genome, and phylogenetic placement in the family Muricidae

Background The Peruvian ‘chanque’ or Chilean ‘loco’ Concholepas concholepas is an economically, ecologically, and culturally important muricid gastropod heavily exploited by artisanal fisheries in the temperate southeastern Pacific Ocean. In this study, we have profited from a set of bioinformatics tools to recover important biological information of C. concholepas from low-coverage short-read NGS datasets. Specifically, we estimated the size and ploidy of the nuclear genome and its transposable elements content using an in silico k-mer approach; we discovered, annotated, and quantified those transposable elements; we assembled and annotated the 45S rRNA DNA operon and mitochondrial genome; and we confirmed the phylogenetic position of C. concholepas within the muricid subfamily Rapaninae based on translated protein coding genes. Results Using a k-mer approach, the haploid genome size estimated for the predicted diploid genome of C. concholepas varied between 1.83 Gbp (with k-mer = 24) and 2.32 Gbp (with k-mer = 36). Between half and two thirds of the nuclear genome of C. concholepas was composed of transposable elements. The most common transposable elements were classified as Long Interspersed Nuclear Elements and Short Interspersed Nuclear Elements, which were more abundant than DNA transposons, simple repeats, and Long Terminal Repeats. Less abundant repeat elements included Helitron mobile elements, 45S rRNA DNA, and Satellite DNA, among a few others. The 45S rRNA DNA operon of C. concholepas that encodes the ssrRNA, 5.8S rRNA, and lsrRNA genes was assembled into a single contig 8,090 bp long. The assembled mitochondrial genome of C. concholepas is 15,449 bp long and encodes 13 protein coding genes, two ribosomal genes, and 22 transfer RNAs. Conclusion The information gained by this study will inform the assembly of a high quality nuclear genome for C. concholepas and will support bioprospecting and biomonitoring using environmental DNA to advance development of conservation and management plans in this overexploited marine snail. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-023-09953-7.


Background
Among gastropod molluscs, known for their species richness and eco-morphological disparity ([1] Aktipis et al. 2018), the Peruvian 'chanque' or Chilean 'loco' Concholepas concholepas (Bruguière, 1789) represents an interesting case of shell form evolution: it exhibits a flattened, rather simple limpet-like shell in a family characterized by spectacularly ornamented spiral shells ([2] Vermeij 2017). Concholepas concholepas is also an economically, ecologically, and culturally important muricid (Muricidae) heavily exploited by artisanal and commercial fisheries along most of its range in the temperate southeastern Pacific Ocean ([3] Manríquez and Castilla 2018).
The species inhabits cold and temperate waters along the southwestern coast of South America, from Lobos de Afuera in Peru to Cape Horn in Chile (Fig. 1). Concholepas concholepas is also present in the Juan Fernandez Archipelago, off the central coast of Chile. This carnivorous snail lives in intertidal and shallow subtidal rocky habitats among holdfasts in kelp forests or in encrusting communities composed of mussels, tunicates, and/or barnacles ([4] Stotz et al. 2003, [3] Manríquez and Castilla 2018). Concholepas concholepas is considered a keystone species on rocky shores; it controls (via consumption) the abundance of the competitively dominant mussel Perumytilus purpuratus and thus frees primary space for barnacles and algae to grow. Overall, the diversity of benthic primary-substratum users increases in the presence of C. concholepas ([5] Castilla 1999). Adult specimens of the edible C. concholepas can reach a maximum shell length of 150-160 mm ([6] Wolff 2008) and have been heavily targeted, together with juveniles, by subsistence and artisanal fisheries for at least 60 years along the southeastern Pacific coast ([3] Manríquez and Castilla 2018). Currently, C. concholepas is one of the main invertebrates targeted by small-scale fisheries with territorial use rights in Chile, but it has been harvested by coastal human societies for more than 8-10 thousand years in Peru and Chile ([7] Jerardino et al. 1992, [8] Reitz et al. 2017, [9] Santoro et al. 2017).
Given its ecological role and commercial importance, the life history and population dynamics of C. concholepas are relatively well studied ([10] Molinet et al. 2005, [3] Manríquez and Castilla 2018, and references therein) and the species has been used as a model system during the last decades in research focusing on population and community ecology, ecophysiology, behavioral ecology, genetics and molecular biology, and biogeography, among others (e.g., [3] Manríquez and Castilla 2018 and references therein). Unexpectedly, despite the ecological relevance, cultural significance, and commercial value of C. concholepas, only a few genomic resources have been developed for this species (i.e., [11][12][13] Cárdenas et al. 2007, 2011, 2016, [14] Núñez-Acuña et al. 2013, [15] Gallardo-Escárate et al. 2013, [16] Détrée et al. 2017). Advancing genomic resources in this iconic snail is of utmost importance to continue improving the understanding of its ecology and key role in its community while also supporting conservation and fisheries management plans.
The present study forms part of a comprehensive project to develop genomic resources for C. concholepas and other marine organisms that are intensively targeted by commercial and artisanal fisheries in the temperate southeastern Pacific Ocean. Here, we have used low-coverage short-read next generation sequencing and profited from a set of bioinformatic pipelines designed to retrieve biological information from low-coverage datasets to i. estimate the genome size and ploidy of C. concholepas using an in silico k-mer strategy, ii. calculate the content of transposable elements in the nuclear genome of C. concholepas, iii. annotate and characterize those transposable elements, iv. assemble the 45S rRNA nuclear DNA operon that encodes the large and small nuclear rRNA genes (18S or ssrDNA, 28S or lsrDNA), the 5.8S rDNA gene, two internal transcribed spacers (ITS1 and ITS2), and two external transcribed spacers (5′ ETS and 3′ ETS), v. assemble, annotate, and describe in detail the mitochondrial genome (mitogenome) of C. concholepas, and vi. explore the position of C. concholepas among muricid gastropods based on the phylogenetic signal provided by translated protein coding genes. These new genomic resources will guide a chromosome-level genome assembly of C. concholepas and will eventually support fisheries management and conservation strategies.

Genome size and ploidy estimation in Concholepas concholepas
Using an in silico k-mer approach, the haploid genome size estimated for C. concholepas varied between a minimum of 1,825,342,588 bp (1.83 Gbp, with k-mer = 24) and a maximum of 2,327,023,338 bp (2.32 Gbp, with k-mer = 36). No clear trend of concomitant increases in genome size with k-mer word size was observed in our analysis. Genome size (GS) estimated using either flow cytometry, Feulgen densitometry, bulk fluorometric assay, or biochemical analysis is known for only 13 muricid gastropods (Animal Genome Size Database [http://www.genomesize.com/] - [17] Gregory 2021 [consulted on 9 8 2023]) and ranges between a minimum of 2.04 Gb in the Southern oyster drill Thais haemastoma and 5. 75  ). Overall, our estimates of GS for C. concholepas using an in silico k-mer approach are within the range observed for gastropods and very similar to those reported for muricid snails.
Using a second in silico k-mer analysis on the relative abundance of heterozygous k-mer pairs with the program Smudgeplot, the nuclear genome of C. concholepas was determined to be diploid (Fig. 2). Diploidy is often assumed in muricids and other gastropods, although studies on ploidy are rare in this clade ([22] Lopez et al. 2019). In other gastropod families, species with different ploidy are found within the same family or genus (e.g., in the freshwater snail Bulinus truncatus/tropicus species complex - [23] Yusuf et al. 2017). Chromosome evolution studies in the family Muricidae are lacking. We argue that a combination of low-coverage sequencing and k-mer spectra analyses can advance our understanding of ploidy evolution and its environmental correlates in marine snails and other marine invertebrates.

Transposable elements in Concholepas concholepas
The pipeline RESPECT estimated that the repetitive genome content of C. concholepas ranged from a minimum of 49% (with k-mer = 51) to a maximum of 66% (with k-mer = 21). In our analysis, a trend of decreasing repetitive genome content was observed with increases in k-mer word size. In general, between half and two thirds of the nuclear genome of C. concholepas is composed of transposable elements. Repetitive genome content varies considerably among gastropods and ranges from 11.40% in the freshwater snail Pomacea canaliculata ( Overall, repetitive content in the genome of C. concholepas is within the range observed for gastropods and is similar to that reported for the cofamilial Rapana venosa. The size of the nuclear genome of C. concholepas and its high proportion of transposable elements suggest that chromosome conformation capture techniques (e.g., Hi-C) in addition to short and long reads (e.g., Oxford Nanopore Technology and/or Pacific Biosciences) will be necessary to assemble a chromosome-level genome in this gastropod.
The program dnaPipeTE estimated that 34.19% of the genome in C. concholepas represented repetitive elements, a value lower than that reported by RESPECT. Also, dnaPipeTE reported a relatively high portion of repetitive elements (i.e., 47.99%) as 'unknown'; these repetitive elements were not annotated (not assigned to any known family) using the Protostomia database of transposable elements from the Dfam consortium. Taking into account only those repetitive elements that were annotated by dnaPipeTE, the most common repetitive elements were classified as Long Interspersed Nuclear Elements (LINEs, 15.12%) and Short Interspersed Nuclear Elements (SINEs, 7.18%), which were more abundant than DNA transposons (DNA, 1.79%), tRNA (1.13%), simple repeats (1.01%), and Long Terminal Repeats (LTR, 0.73%). Less abundant repeat elements included rRNA DNA (0.28%) and Satellite DNA (0.21%), among a few others (Fig. 3). In other gastropods with assembled genomes in which the 'repeatome' has been characterized, the proportion of unclassified repetitive elements is usually low, in contrast to our observations. For example, only 0.16% of the genome content corresponds to unclassified repetitive elements in the cofamilial Rapana venosa ( dnaPipeTE also estimated the repetitive elements 'landscape' in C. concholepas that exhibited a leptokurtic distribution (Fig. 3). However, no obvious 'peaks', either in the recent or distant past, were observed in the repetitive elements landscape that could be interpreted as ancient bursts. Still, the analysis suggests that transposable elements  ). Future studies focusing on transposable elements activity will permit the exploration of the conditions driving the dynamics of the 'repeatome' in the species-rich class Gastropoda. Furthermore, Casacuberta and González (2013) [29] have argued that repetitive elements can influence the capability of their hosts to respond to environmental challenges. Whether repetitive elements affect the ability of molluscs and other marine invertebrates to respond to pervasive global change challenges remains to be addressed. Importantly, phylogenetic relationships within and among the different mollusc lineages, including gastropods, have been explored using fragments of the 18S and 28S ribosomal genes for more than 20 years now ([30] Colgan et al. 2007, [31] Zou et al. 2011). We have shown here that low-coverage sequencing data can be used to assemble the complete 45S rRNA DNA operon of C. concholepas. The recovery of additional 45S rRNA DNA sequences in other muricids using bioinformatic tools tailored for low-coverage sequencing datasets can be used to understand the organization and evolutionary dynamics of this repetitive element in molluscs.

The mitochondrial genome of Concholepas concholepas
The pipeline GetOrganelle assembled the mitochondrial genome of C. concholepas (OR506260) with k-mer and base coverage equal to 125× and 521×, respectively. The mitochondrial genome of C. concholepas is 15,449 bp long and encodes 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, and two ribosomal RNA genes (12S ribosomal RNA [rrnS] and 16S ribosomal RNA [rrnL]) (Table 1, Fig. 4). The mitochondrial genome of C. concholepas also contains a relatively short non-coding putative Control Region (CR) 249 bp long. Mitochondrial gene order in C. concholepas was identical to that previously reported for other species of gastropods belonging to the family Muricidae (  S1). Each amino acid in the mitochondrial PCGs of C. concholepas was encoded by a minimum of two and a maximum of eight codons, with the former being more typical (12 out of 20 amino acids) (Fig. 5). RSCU values also indicated that synonymous codons for the same amino acid were not used equally in the mitochondrial PCGs of C. concholepas. Specifically, codons ending in A or T were overrepresented compared to codons ending in C or G (Fig. 5). Studies on the codon usage of mitochondrial PCGs have not been conducted before in other muricid gastropods. However, codon usage biases in mitochondrial PCGs have been invariably reported in other marine invertebrates, including molluscs (e.g., in bivalves - [36] Sun and Gao 2017) and gastropods (e.g., in the family Strombidae - [37] Li et al. 2022). The AT-rich nucleotide usage of the studied mitochondrial genome is likely a reflection of the codon usage bias reported herein for the mitochondrial PCGs of C. concholepas. The conditions explaining the non-random use of codons in mitochondrial PCGs are not well understood, and several factors have been proposed to drive genome-wide or mitogenomic codon usage biases, i.e., mutational bias, selection for optimizing the translation process by tRNA abundance, and harsh environmental conditions, among others (see [38] Jia and Higgs 2008 and references therein).
The ω ratios calculated for all the PCGs in the mitochondrial genome of C. concholepas were lower than 1 (Table 2), implying that all of these genes experience purifying selection. The highest ω values were observed in atp8, cox3, and all PCGs belonging to the nad family except nad1, indicating that the aforementioned genes (other than nad1) are experiencing the weakest selective pressures in the mitochondrial genome of C. concholepas. In turn, the lowest ω values were observed in cox1, cox2, cob, nad1, and atp6, indicating that these genes are experiencing the strongest selective pressure compared to the remaining PCGs. Selective pressure analyses have not been conducted before in any representative of the family Muricidae. However, our results agree with those from studies in other gastropod clades showing that all PCGs are under purifying selection (e.g., in the families Neritidae [[39] Feng et al. 2021] and Strombidae [[37] Li et al. 2022], among others). The species richness and ecological disparity characteristic of the Muricidae suggest that this family might be a suitable model system to understand the effect of environmental conditions on the adaptive evolution of mitochondrial protein coding genes.
In the mitochondrial genome of C. concholepas, the 22 tRNAs varied in length between 65 bp (trnS2) and 70 bp (trnL1). All tRNAs exhibited a typical 'cloverleaf' secondary structure except trnS1 and trnS2, which lacked the DHU arm and loop, respectively (Fig. 6). To the best of   dinucleotide-motif microsatellite repeats (AA and TT, each repeated 3 times) were reported by the web server Microsatellite Repeat Finder. Lastly, the web server RNAfold predicted a single optimal secondary structure with a minimum free energy of -212.00 kcal/mol (free energy of the thermodynamic ensemble = -212.34 kcal/mol) that formed a single 'hairpin' bearing a long stem and very short loop (Supplementary Materials Fig. S1). The long stem was the consequence of a perfect 111 bp long palindromic motif (5'-AGC CAG CAC TCA CTC CAA GAG TGC TGG CCA AAG GGC TCC GCC GAG CGA ACC TGA AAT TTT ATA GTT TTA GAG GCA CAG AGC CAA AAT TAT CTA TTT TTT GCT TAA TTT CTA-3'). Importantly, the non-coding putative CR in the mitochondrial genome of muricids and other gastropods is short and can be considered extremely truncated compared to that of other marine invertebrates (see [33

Phylomitogenomics of the family Muricidae
In the ML phylogenetic analysis (48 terminals, 3,697 characters, 1,066 parsimony-informative sites), C. concholepas together with all other representatives of the family Muricidae used in this study clustered together   In the subfamily Rapaninae, C. concholepas had a late branching position and formed a monophyletic clade with a second mitochondrial genome of C. concholepas already available in GenBank (JQ446041). We note that this previously available mitochondrial genome of C. concholepas is a chimeric molecule assembled using 'noisy' (= high error-rate) pyrosequencing DNA reads, transcriptomic data from 50 specimens, and Sanger sequencing ([14] Núñez-Acuña et al. 2013).
There is a long history of strong interest in this large globally distributed family, and the family Muricidae was mainly established based on shell and radular characteristics but updated with molecular phylogenetic results (e.g.[

Conclusion
In summary, we have produced a set of genomic resources for the Chilean 'loco' or Peruvian 'chanque' C. concholepas, a species of considerable ecological, commercial, and cultural importance in the southeastern Pacific Ocean that is experiencing heavy fishing pressure and major environmental challenges (i.e., due to pollution, ocean acidification, and increased temperature). This is the first study focusing on this muricid mollusc that has profited from a set of bioinformatics tools to recover important biological information from low-coverage short-read NGS datasets. We have calculated the size and ploidy of the nuclear genome and estimated its transposable elements content. Also, we have discovered, annotated, and quantified these repetitive elements. We have assembled and annotated the 45S rRNA DNA operon and mitochondrial genome. Lastly, we have confirmed the phylogenetic position of C. concholepas in the muricid subfamily Rapaninae based on translated PCGs. The new information generated by this study will inform the assembly of a high quality nuclear genome for C. concholepas, is expected to support bioprospecting and biomonitoring using state-of-the-art genomic techniques (eDNA) in this species, and will contribute to improving the understanding of the genomic mechanisms related to the

Specimen, DNA extraction, library preparation and sequencing
A frozen specimen of C. concholepas (caught in Chile) was bought from a local supermarket in Raleigh, North Carolina, USA and transported to Clemson University (CU), Clemson, South Carolina, USA. The specimen was deposited at CU's Crustacean Collection (accession number CU-CC-2022-15-05). In the laboratory, a small tissue fragment (0.75 cm³) was dissected from the foot and preserved in 95% ethanol for shipping to Iridian Genomes, Inc. (Bethesda, MD), where genomic DNA (gDNA) extraction and next generation sequencing (NGS) were conducted. gDNA was extracted from the sample using the DNeasy Blood and Tissue Kit (Qiagen, Germany) following the manufacturer's protocol. Then, the library was prepared using the Illumina TruSeq kit following the manufacturer's protocol. NGS was performed on an Illumina HiSeq X Ten system (Illumina, San Diego, CA, USA) using a 2 × 150 cycle. A total of 72,006,895 pairs (PE) of reads were produced by Iridian Genomes and were deposited in the short read archive (SRA) repository (Bioproject: PRJNA996197; BioSample: SAMN36527401; SRA accession number: SRR25338493) at NCBI's GenBank.

Genome size and ploidy estimation in Concholepas concholepas
To estimate genome size in C. concholepas using an in silico k-mer strategy, we first removed Illumina adapters and low quality sequences (Phred scores < 20) from the dataset (raw Illumina reads) using the program fastp v.0. To estimate ploidy in C. concholepas using an in silico k-mer strategy, the k-mer frequency distribution generated with the pipeline KMC using word size equal to 21 was submitted to the program Smudgeplot v0.2.5 ([51] Ranallo-Benavidez et al. 2020). After visual examination of k-mer coverage in the web server GenomeScope (http://qb.cshl.edu/genomescope/genomescope2.0 - [51] Ranallo-Benavidez et al. 2020), we selected high coverage k-mers between 20× and 120× for the analysis of heterozygous k-mer pairs in Smudgeplot.
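As an illustration of the reasoning behind k-mer based genome size estimation, the following is a minimal sketch of the classical peak-based estimate, in which genome size is approximated as the total number of k-mer occurrences divided by the depth of the main coverage peak. It is not the statistical model implemented by the programs used in this study, and it assumes a k-mer depth histogram (depth mapped to number of distinct k-mers) has already been produced by a k-mer counter. The function name, the `min_depth` error cutoff, and the toy histogram are hypothetical. With low-coverage data the main peak may not be separable from the error peak, which is why dedicated low-coverage tools are needed in practice.

```python
from typing import Dict


def estimate_genome_size(histogram: Dict[int, int], min_depth: int = 5) -> float:
    """Peak-based haploid genome size estimate from a k-mer depth histogram.

    histogram maps k-mer depth (multiplicity) to the number of distinct
    k-mers observed at that depth. Depths below min_depth are assumed to be
    dominated by sequencing errors and are ignored (hypothetical cutoff).
    """
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Depth holding the most distinct k-mers: taken as the main coverage peak.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences: sum over depths of depth * distinct k-mers.
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth


# Toy histogram (made-up numbers, not data from this study).
toy_histogram = {1: 1000, 2: 300, 10: 50, 20: 100, 30: 40}
print(estimate_genome_size(toy_histogram))
```

In a heterozygous diploid genome the tallest peak can correspond to heterozygous k-mers rather than the homozygous peak, so a real analysis should inspect the spectrum before trusting this value.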

Transposable elements in the genome of Concholepas concholepas
First, we mapped the set of clean and decontaminated PE reads to a newly assembled mitochondrial genome of C. concholepas (see below) with the program HISAT2 v2.

Nuclear ribosomal cassette in Concholepas concholepas
We assembled the 45S rRNA DNA of C. concholepas using the program TAREAN (tandem repeat analyzer- [56]

Mitochondrial genome assembly and characterization in Concholepas concholepas
We used the program GetOrganelle v1. The web server Chloroplot (https://irscope.shinyapps.io/Chloroplot/ - [66] Zheng et al. 2020) was used to render the studied mitochondrial genome as a circular map. Nucleotide usage for the entire mitochondrial genome was estimated using the software MEGA X. The codon usage of all PCGs was calculated using the codon usage tool in the web server Sequence Manipulation Suite (https://www.bioinformatics.org/sms2/codon_usage.html - [67] Stothard et al. 2000). Relative synonymous codon usage (RSCU) was calculated with the tool EZcodon as implemented in the web server EZmito (http://ezmito.unisi.it/ - [68] Cucini et al. 2021).
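For readers unfamiliar with RSCU, the sketch below illustrates the standard definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It is not the EZcodon implementation. The function name `rscu` and the toy inputs are hypothetical, and the caller is assumed to supply the codon-to-amino-acid mapping appropriate for the genome (for a gastropod mitogenome, the invertebrate mitochondrial genetic code), which is deliberately not hard-coded here.

```python
from collections import Counter
from typing import Dict, List


def rscu(cds: str, codon_to_aa: Dict[str, str]) -> Dict[str, float]:
    """Relative synonymous codon usage for an in-frame coding sequence.

    RSCU(codon) = observed count / (family total / number of synonymous codons).
    Codons absent from codon_to_aa (e.g. ambiguous or stop codons) are skipped.
    """
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable_length, 3)]
    counts = Counter(c for c in codons if c in codon_to_aa)

    # Group synonymous codons by the amino acid they encode.
    families: Dict[str, List[str]] = {}
    for codon, amino_acid in codon_to_aa.items():
        families.setdefault(amino_acid, []).append(codon)

    values: Dict[str, float] = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid not present in this sequence
        expected = total / len(family)
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Toy example restricted to the four alanine codons (made-up sequence).
alanine = {"GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A"}
print(rscu("GCTGCTGCAGCC", alanine))
```

Values above 1 indicate codons used more often than expected under equal usage, and values below 1 indicate underused codons.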
We conducted an analysis of selective pressures for each mitochondrial PCG. For this purpose, the software KaKs_calculator 2.0 ([69] Wang et al. 2010) was used to calculate the number of nonsynonymous substitutions per nonsynonymous site dN, the number of synonymous substitutions per synonymous site dS, and the ratio ω = dN/dS for each PCG. During calculations, we used the γ-MYN model to account for mutation rate variance along the studied sequences and the cofamilial Rapana venosa as an outgroup (GenBank accession number KM213962). The observed ω ratio is expected to be equal to 1, < 1, or > 1 if a particular PCG is evolving neutrally, is under purifying (negative) selection, or is under diversifying (positive) selection, respectively.
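The interpretation rule for ω can be written down directly. The following sketch only classifies a point estimate of ω = dN/dS; it does not estimate dN or dS (that is the role of KaKs_calculator) and it performs no significance test. The function names and the numeric inputs are hypothetical.

```python
import math


def omega_ratio(dn: float, ds: float) -> float:
    """Return omega = dN/dS; dS must be positive for the ratio to be defined."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    return dn / ds


def selection_regime(omega: float, rel_tol: float = 1e-9) -> str:
    """Classify a point estimate of omega.

    omega == 1 -> neutral, omega < 1 -> purifying, omega > 1 -> diversifying.
    """
    if math.isclose(omega, 1.0, rel_tol=rel_tol):
        return "neutral"
    return "purifying" if omega < 1.0 else "diversifying"


# Made-up substitution rates, not values from Table 2.
print(selection_regime(omega_ratio(0.02, 0.5)))
```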

Phylogenetic position of Concholepas concholepas in the family Muricidae
We tested the phylogenetic position of C. concholepas in the family Muricidae based on the phylogenetic signal from translated PCGs. A maximum likelihood (ML) phylogenetic analysis was conducted using the newly assembled mitochondrial genome of C. concholepas, a second mitochondrial genome of C. concholepas already available in GenBank (JQ446041), and those of 32 cofamilial species with mitochondrial genomes available in GenBank. The analysis used 11 other species belonging to other neogastropod families as outgroups. First, each set of PCG nucleotide sequences was translated to amino acids and then aligned with the program Clustal Omega ([73]

Fig. 2 Relationship between coverage of heterozygous k-mer pairs and normalized minor k-mer coverage in Concholepas concholepas

[20] Song et al. 2018). Interestingly, in the latter species, LINEs were the most common (39.636% of the assembled genome) but SINEs were the rarest (6.09 Mb, 0.27%) among repetitive elements ([20] Song et al. 2023). No 'repeatome' analysis is available for the second muricid with an assembled genome, Stramonita haemastoma ([21] Farhat et al. 2023). Expansion of repetitive elements has been suggested to account for genome size increases in both vertebrates and invertebrates ([27] Helmkampf et al. 2019). Whether or not expansion of mobile genetic elements explains genome size variance in gastropods and other molluscs remains to be addressed.
have a high turnover in the nuclear genome of C. concholepas.By contrast, in the deep-sea limpet Bathyacmaea lactea repetitive elements have undergone long-lasting activity in the deep-past (i.e., until the last 10 Mya) that included two concentrated TE expansions ([28] Liu et al. 2020).To the best of our knowledge, no studies have examined the transposable elements landscape in gastropods other than in C. concholepas (this study) and Bathyacmaea lactea ([26] Liu et al. 2020

Fig. 4 Circular map of the mitochondrial genome of Concholepas concholepas. Photograph credit: Gustavo Duarte (published with permission)

] Yu et al. 2023). Detailed analyses of this non-coding region in gastropod mitochondrial genomes are rare ([33] Yu et al. 2023). We argue that additional studies characterizing this region in detail will help us understand mitochondrial genome replication and translation in gastropods.

Fig. 5 Relative synonymous codon usage in the 13 protein coding genes encoded in the mitochondrial genome of Concholepas concholepas

into a single fully supported clade (bootstrap value [bv] = 100) (Fig. 7). Within the monophyletic family Muricidae, fully supported subfamilies included the Ergalataxinae, represented by 3 genera and 6 species in our analysis, Ocenebrinae, represented by 2 genera and 6 species, Muricinae, represented by 3 genera and 4 species, and Rapaninae, represented by 10 genera and 17 species. The subfamily Pagodulinae, represented by a single species, Boreotrophon candelabrum, in our analysis, was well supported (bv = 90) as a taxon sister to the subfamily Muricinae. In turn, Coralliophila richardi (subfam. Coralliophilinae) has an early branching position in the family Muricidae; it was sister to all other studied muricids, in line with that reported by [34] Harasewych et al. (2022). In general, most of the relationships among subfamilies in muricids recovered by our ML analysis agree with those previously reported by [34] Harasewych et al. (2022) and [33] Yu et al. (2023), which used a smaller set of mitochondrial genomes for phylogenetic reconstruction.

Fig. 6 Secondary structure of the tRNA genes encoded in the mitochondrial genome of Concholepas concholepas

Fig. 7 Maximum likelihood phylogenetic hypothesis for the family Muricidae and phylogenetic placement of Concholepas concholepas. The tree was retrieved using the phylogenetic signal provided by the translated mitochondrial protein coding genes. The robustness of the ML tree topology was ascertained by 1,000 bootstrap pseudoreplicates of the tree search. Numbers above branches near nodes represent bootstrap values. Photograph credit: Gustavo Duarte (published with permission)

2.1 ([52] Kim et al. 2019) and used only those reads that did not map to the mitochondrial genome (n = 61,865,612 PE reads) for the discovery, annotation, and quantification of repetitive elements in the nuclear genome of C. concholepas using the program dnaPipeTE v1.4c ([53] Goubert et al. 2015, [54] Goubert 2022). Using low-coverage Illumina datasets, dnaPipeTE assembles repetitive elements and then annotates them based on homology with the program RepeatMasker (www.repeatmasker.org). Finally, dnaPipeTE maps a random sample of the reads onto the assembled repetitive elements to quantify their abundance. We executed dnaPipeTE with two iterations of the assembler Trinity using independent read sets, sampled at 0.25X each time ([54] Goubert 2022), and the Protostomia-specific database of transposable elements from the Dfam consortium (https://www.dfam.org/ - [55] Hubley et al. 2016). Lastly, we retrieved the transposable elements landscape of C. concholepas, which dnaPipeTE estimated by calculating and plotting the (blastn) divergence between transposable element copies in the genome (estimated from reads) and their respective assembled consensus sequences ([54] Goubert 2022).
6.4 ([62] Jin et al. 2020) to de novo assemble the mitochondrial genome of C. concholepas using the totality of the raw Illumina reads. The mitochondrial genome of the cofamilial Rapana venosa (with GenBank accession number MZ435265) was used as a 'seed' during the assembly run that utilized k-mer sizes of 21, 55, 85, and 115 ([62] Jin et al. 2020). The pipeline MITOS2 (http://mitos2.bioinf.uni-leipzig.de - [63] Donath et al. 2019) was used for the in silico annotation of the newly assembled mitochondrial genome, and manual curation (i.e., readjustments to the start and stop codons of the protein coding genes [PCGs]) was conducted using the software MEGA X ([64] Kumar et al. 2018) and the web server translation tool ExPASy (https://web.expasy.org/translate/ - [65] Gasteiger et al. 2003).
Sievers and Higgins 2014). Next, we eliminated poorly aligned regions with the program trimAl ([74] Capella-Gutiérrez et al. 2009) in each PCG alignment and used the program ProtTest ([75] Darriba et al. 2011) to partition the dataset and select the best fitting models of sequence evolution for each partition. Lastly, a ML analysis was conducted in the web server IQ-TREE version 1.6.10 ([76] Nguyen et al. 2015) with the concatenated but partitioned PCG amino acid alignment. The robustness of the ML tree topology was assessed by 1,000 bootstrap iterations of the observed data as in [58] Tucker et al. (2023).
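The "concatenated but partitioned" alignment described above can be illustrated with a short sketch. It assumes each PCG has already been translated, aligned, and trimmed, and that alignments are held in memory as dictionaries mapping taxon name to aligned sequence; the function name and the toy data are hypothetical, and this is not the code used in this study. Taxa missing a gene are padded with gaps, and the returned coordinates (1-based, inclusive) are the information a partition file for a ML program would need.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned amino acid sequence


def concatenate_alignments(
    alignments: Dict[str, Alignment],
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Concatenate per-gene alignments into a supermatrix with partitions."""
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} are not the same length")
        length = lengths.pop()
        for taxon in taxa:
            # Pad taxa lacking this gene with gap characters.
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, partitions


# Toy alignments (made-up sequences and taxon labels).
toy = {
    "cox1": {"taxonA": "MKV", "taxonB": "MRV"},
    "nad1": {"taxonA": "LL-F", "taxonB": "LLIF", "taxonC": "LMIF"},
}
matrix, parts = concatenate_alignments(toy)
print(matrix)
print(parts)
```

Keeping the partition boundaries alongside the supermatrix allows a separate substitution model to be fitted to each gene, which is the purpose of partitioning the dataset before the ML search.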

Table 1
Mitochondrial genome of Concholepas concholepas. Arrangement and annotation

Table 2
Selective pressure analysis of the protein coding genes (PCGs) in the mitochondrial genome of Concholepas concholepas. KA/KS values were calculated using the γ-MYN model using the mitochondrial genome of Rapana venosa as outgroup

43] Barco et al. 2010, [44] Claremont et al. 2013). These advances improved the understanding of plasticity and convergence in some shell characters, and while the taxonomic affinities of many species remain enigmatic, Concholepas is placed confidently within the subfamily Rapaninae, which is confirmed by our analyses herein. Researchers working on Muricidae have already produced a mitogenome phylogeny based on 23 muricid species, a smaller set of mitochondrial genomes than that used for phylogenetic reconstruction in this study ([33] Yu et al. 2023). Data are lacking to enable phylogenomic analyses with strong taxon sampling for molluscs, which can bias results ([45] Sigwart et al. 2021), and it is important to continue to expand taxon sampling, especially for unusual morphologies like C. concholepas.